Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language
Abstract
BACKGROUND Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is frequently used as a source of health care information by both professionals and the lay public.
OBJECTIVE This paper quantifies the production and consumption of Wikipedia's medical content along 4 dimensions. First, we measured the amount of medical content, in both articles and bytes. Second, we measured the citations that support that content. Third, we analyzed medical readership: how it compares with that of other health care websites, how it differs between Wikipedia's natural language editions, and how it relates to disease prevalence. Fourth, we surveyed the number and characteristics of Wikipedia's medical contributors, including year-over-year participation trends and editor demographics.
METHODS Using a well-defined categorization infrastructure, we identified medically pertinent English-language Wikipedia articles and links to their foreign language equivalents. With these, Wikipedia can be queried to produce metadata and full texts for entire article histories. Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity. An online survey was used to determine the background of contributors. Standard mining and visualization techniques (eg, aggregation queries, cumulative distribution functions, and/or correlation metrics) were applied to each of these datasets. Analysis focused on year-end 2013, but historical data permitted some longitudinal analysis.
RESULTS At the end of 2013, Wikipedia's medical content comprised more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content was supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013, making it one of the most viewed medical resources globally, if not the most viewed. The core editor community numbered fewer than 300 and declined over the past 5 years. Half of the members of this community were health care providers, and 85.5% (100/117) had a university education.
CONCLUSIONS Although Wikipedia has a considerable volume of multilingual medical content that is extensively read and well-referenced, the core group of editors who contribute and maintain that content is small and shrinking.
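The abstract names aggregation queries, cumulative distribution functions, and correlation metrics without showing them. The following is a minimal Python sketch of what such an analysis might look like, not the authors' actual pipeline. The function names, the record layout of `(article, views)` pairs, and the sample figures are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import Counter
from math import sqrt


def aggregate_views(hourly_records):
    """Sum hourly (article, views) records into per-article totals."""
    totals = Counter()
    for article, views in hourly_records:
        totals[article] += views
    return totals


def empirical_cdf(values):
    """Return (value, fraction of observations <= value) for each distinct value."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, bisect_right(ordered, v) / n) for v in sorted(set(ordered))]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical hourly traffic records, invented for this example.
hourly_records = [
    ("Asthma", 120),
    ("Diabetes", 300),
    ("Asthma", 80),
    ("Malaria", 50),
    ("Diabetes", 100),
]

totals = aggregate_views(hourly_records)
cdf = empirical_cdf(totals.values())
```

Given per-article totals and a matching list of prevalence figures (not supplied here), `pearson` could be used to examine the kind of readership-versus-prevalence relationship the abstract mentions.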
Similar resources
The Workshops of the Tenth International AAAI Conference on Web and Social Media
As a global, multilingual project, Wikipedia could serve as a repository for the world’s knowledge on an astounding range of topics. However, questions of participation and diversity among editors continue to be burning issues. We present the first targeted study of participants at Greek Wikipedia, with the goal of better understanding their motivations. Smaller Wikipedias play a key role in fo...
Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles
When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...
Participation and Scientific Collaboration in Persian Wikipedia
Background and Aim: This research studies effective participation and scientific collaboration in Persian Wikipedia from 2003 to 2012. Method: The library method was used. Given the objectives and the nature of the subject, the research method is descriptive-applied, and scientometric techniques were used in its implementation. Excel and SPSS software have been used for...
Thermodynamic Principles in Social Collaborations
A thermodynamic framework is presented to characterize the evolution of efficiency, order, and quality in social content production systems, and this framework is applied to the analysis of Wikipedia. Contributing editors are characterized by their (creative) energy levels in terms of number of edits. We develop a definition of entropy that can be used to analyze the efficiency of the system as...
Directions for Exploiting Asymmetries in Multilingual Wikipedia
Multilingual Wikipedia has been used extensively for a variety of Natural Language Processing (NLP) tasks. Many Wikipedia entries (people, locations, events, etc.) have descriptions in several languages. These descriptions, however, are not identical. On the contrary, descriptions in different languages created for the same Wikipedia entry can vary greatly in terms of description length and inform...